Self-Supervised Prediction of the Intention to Interact with a Service Robot
A service robot can provide a smoother interaction experience if it has the
ability to proactively detect whether a nearby user intends to interact, in
order to adapt its behavior, e.g. by explicitly showing that it is available to
provide a service. In this work, we propose a learning-based approach to
predict the probability that a human user will interact with a robot before the
interaction actually begins; the approach is self-supervised because after each
encounter with a human, the robot can automatically label it depending on
whether it resulted in an interaction or not. We explore different
classification approaches, using different sets of features considering the
pose and the motion of the user. We validate and deploy the approach in three
scenarios. The first collects natural sequences (both interacting and
non-interacting) representing employees in an office break area: a real-world,
challenging setting, where we consider a coffee machine in place of a service
robot. The other two scenarios represent researchers interacting with service
robots ( and sequences, respectively). Results show that, even in
challenging real-world settings, our approach can learn without external
supervision, and can achieve accurate classification (i.e. AUROC greater than
) of the user's intention to interact with an advance of more than s
before the interaction actually occurs.
Comment: Paper under revision for Robotics and Autonomous Systems journal
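The self-supervised labelling and the AUROC metric mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Encounter` structure, its feature list, and the function names are hypothetical, and the classifier that would produce the scores is left out.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Encounter:
    """One tracked person passing near the robot (hypothetical structure)."""
    features: List[float]  # assumed pose/motion features, e.g. distance and speed
    interacted: bool       # observed outcome, known only after the encounter ends


def auto_label(encounters: Sequence[Encounter]) -> List[int]:
    """Self-supervision: the observed outcome of each encounter becomes its label."""
    return [1 if e.interacted else 0 for e in encounters]


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

In such a setup, the labels returned by `auto_label` would serve as training targets for any probabilistic classifier, and `auroc` would score its predicted probabilities on held-out encounters.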
GROWL: Group Detection with Link Prediction
Interaction group detection has been previously addressed with bottom-up
approaches which relied on the position and orientation information of
individuals. These approaches were primarily based on pairwise affinity
matrices and were limited to static, third-person views. This problem can
greatly benefit from a holistic approach based on Graph Neural Networks (GNNs)
beyond pairwise relationships, due to the inherent spatial configuration that
exists between individuals who form interaction groups. Our proposed method,
GROup detection With Link prediction (GROWL), demonstrates the effectiveness of
a GNN-based approach. GROWL predicts the link between two individuals by
generating a feature embedding based on their neighbourhood in the graph and
determines whether they are connected with a shallow binary classification
method such as Multi-layer Perceptrons (MLPs). We test our method against other
state-of-the-art group detection approaches on both a third-person view dataset
and a robocentric (i.e., egocentric) dataset. In addition, we propose a
multimodal approach based on RGB and depth data to calculate a representation
GROWL can utilise as input. Our results show that a GNN-based approach can
significantly improve accuracy across different camera views, i.e.,
third-person and egocentric views.
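The idea of predicting links from neighbourhood-based embeddings can be illustrated with a toy sketch. It is not GROWL itself. As assumptions made only for illustration, it uses 2D positions as the sole features, a single mean-aggregation step in place of a trained GNN, a fixed distance threshold in place of the MLP classifier, and connected components over the predicted links to form groups. All function names and parameter values are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def neighbourhood_embedding(people: List[Point], i: int, radius: float) -> Point:
    """Mean position of person i and everyone within `radius` of them."""
    near = [p for p in people if math.dist(p, people[i]) <= radius]
    # `near` always contains person i, so the division is safe.
    return (
        sum(p[0] for p in near) / len(near),
        sum(p[1] for p in near) / len(near),
    )


def predict_links(
    people: List[Point], radius: float = 2.0, threshold: float = 1.0
) -> List[Tuple[int, int]]:
    """Link two people when their embeddings are close (stand-in for a classifier)."""
    emb = [neighbourhood_embedding(people, i, radius) for i in range(len(people))]
    return [
        (i, j)
        for i in range(len(people))
        for j in range(i + 1, len(people))
        if math.dist(emb[i], emb[j]) <= threshold
    ]


def groups_from_links(n: int, links: List[Tuple[int, int]]) -> List[List[int]]:
    """Merge linked individuals into groups with a small union-find."""
    parent = list(range(n))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in links:
        parent[find(i)] = find(j)

    clusters: Dict[int, List[int]] = {}
    for k in range(n):
        clusters.setdefault(find(k), []).append(k)
    return sorted(clusters.values())
```

For example, three people standing close together and two others standing far away would yield two groups, since links are only predicted within each cluster.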